Question 4


Intent of Question 





The primary goals of this question were to assess a student's ability to (1) describe why the median might 
be preferred to the mean in a particular context; (2) compare the relative merits of two sampling plans; and 
(3) describe a consequence of nonresponse in a particular study. 


Solution 


Part (a): 
The median is less affected by skewness and outliers than the mean. With a variable such as income, a 
small number of very large incomes could dramatically increase the mean but not the median. 
Therefore, the median would provide a better estimate of a typical income value. 
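The effect described in part (a) can be checked with a small numerical sketch. The incomes below are hypothetical values chosen for illustration and are not data from the question; the point is only to show how a single very large income moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical yearly incomes (dollars) for nine alumni; illustration only.
incomes = [42_000, 48_000, 51_000, 55_000, 60_000, 63_000, 70_000, 75_000, 82_000]

# Add one hypothetical very large income (an outlier in the right tail).
with_outlier = incomes + [5_000_000]

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"without outlier: mean={mean_before:,.0f}, median={median_before:,.0f}")
print(f"with outlier:    mean={mean_after:,.0f}, median={median_after:,.0f}")
```

With these made-up values the mean rises from about 60,667 to 554,600, while the median moves only from 60,000 to 61,500.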





Part (b): 


Method 2 is better than Method 1. A sample obtained from Method 1 could be biased because of the 
voluntary nature of the response. It is plausible that class members with larger incomes might be more 
likely to return the form than class members with smaller incomes. The mean income for such a sample 
would overestimate the mean income of all class members. With Method 2, despite the smaller sample 
size, the random selection is likely to result in a sample that is more representative of the entire class 
and produce an unbiased estimate of mean yearly income of all class members. 
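The contrast in part (b) can be sketched on a toy population. Everything below is an assumption made for illustration: a hypothetical class of five incomes, samples of size 2 standing in for Method 2, and invented response probabilities that rise with income standing in for Method 1. The sketch enumerates every possible simple random sample instead of simulating, so the result does not depend on a random seed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five incomes (thousands of dollars); illustration only.
population = [30, 40, 50, 60, 120]
population_mean = mean(population)

# Method 2 analogue: every simple random sample of size 2 is equally likely,
# so averaging the sample mean over all possible samples gives its expected value.
sample_means = [mean(s) for s in combinations(population, 2)]
srs_expected_mean = mean(sample_means)

# Method 1 analogue: assumed response probabilities that rise with income.
response_prob = [0.1, 0.1, 0.2, 0.3, 0.6]

# Expected total income of responders divided by expected number of responders.
expected_total = sum(p * x for p, x in zip(response_prob, population))
expected_count = sum(response_prob)
voluntary_mean = expected_total / expected_count

print(f"population mean:             {population_mean}")
print(f"expected SRS sample mean:    {srs_expected_mean}")
print(f"typical voluntary responder: {voluntary_mean:.1f}")
```

Under these assumptions the simple random sample is centered exactly on the population mean of 60, while the voluntary-response figure is pulled above it, which is the overestimate the solution describes. If lower earners were assumed more likely to respond, the bias would run in the other direction.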


Scoring 


This question is scored in three sections. Part (b) has two components: (1) identifying a relevant 
characteristic for each sampling method; (2) indicating the effect of the biased method on the estimate of 
the mean income. Section 1 consists of part (a); section 2 consists of part (b), component 1; and section 3 
consists of part (b), component 2. Sections 1, 2, and 3 are scored as essentially correct (E), partially correct 
(P), or incorrect (I). 


Section 1 is scored as follows: 


Essentially correct (E) if the response includes the following two components: 
1. Describes how skewness or outliers affect the mean or do not affect the median. 
2. Makes a conjecture about a relevant characteristic of the distribution of incomes, such as 
skewness or an outlier. 


Partially correct (P) if the response includes only one of the two components listed above. 
Incorrect (I) if the response does not meet the criteria for E or P. 


Notes: 
• For Component 1, examples of responses that are acceptable include: 
  o The mean is affected by skewness (outliers). 
  o The median is not affected by skewness (outliers). 
  o The mean is greater (less) than the median when there is right (left) skewness or 
    outliers. 
• For Component 1, examples of responses that are not acceptable include: 
  o Don’t use the mean for skewed distributions or distributions with outliers. 
  o Use the median for skewed distributions or distributions with outliers. 
  o Responses that include an incorrect statement about means and/or medians, such as 
    for right skewed distributions, the median will be higher than the mean. 
• It is possible to satisfy both components with a single sentence, such as, “If there was a 
  billionaire in the sample, the mean would be higher than the median.” 
• If a response argues that using the mean is a more appropriate way to estimate the typical 
  income, then reduce the score in section 1 by one level (that is, from E to P or from P to I). 





Section 2 is scored as follows: 


Essentially correct (E) if the response chooses Method 2 AND includes the following two components: 
1. Identifies a relevant characteristic of Method 1. 
2. Identifies a relevant characteristic of Method 2. 


Partially correct (P) if the response chooses Method 2 AND includes only one of the two components 
listed above 

OR 
if the response includes both components but does not choose a method. 


Incorrect (I) if the response chooses Method 1 OR otherwise does not meet the criteria for E or P. 


Notes: 

• Responses that do not explicitly choose Method 2 can still earn an E for section 2 if the choice is 
  clearly implied. The choice of Method 2 is clearly implied if the response only discusses negative 
  characteristics of Method 1 and only discusses positive characteristics of Method 2, such as, 
  Method 1 is biased but Method 2 uses a random sample. 

• Responses that compare the two methods can satisfy both components, such as, saying that 
  Method 1 is more biased or that nonresponse will be less of a problem with Method 2. 

• Responses that refer to the nonresponse bias as voluntary response bias, response bias, or 
  undercoverage can still earn an E. 

• Discussions of conditions for inference should be considered extraneous and ignored. 


Section 3 is scored as follows: 


Essentially correct (E) if the response includes the following two components: 
1. Indicates the incomes of responders may be different from the incomes of nonresponders. 
2. Indicates the biased sampling method may produce a misleading estimate/conclusion about 
the mean income, including direction, for example, “The sample mean is likely to be higher 
than the mean of the population.” 


Partially correct (P) if the response provides only one of the two components listed above. 
Incorrect (I) if the response does not meet the criteria for E or P. 


Notes: 

• A single sentence can satisfy the first component of section 2 and the first component of 
  section 3. (For example, “In method 1, rich people are more likely to respond.”) 

• For component 2, either direction is acceptable but the direction must be consistent with the 
  identified bias. Saying only that Method 2 will be more accurate or more representative does 
  not satisfy component 2. 

• If a response addresses possible nonresponse bias in Method 2, the response can still satisfy 
  both components of section 3. 

• Responses that focus on the larger sample size in Method 1 can satisfy component 2 if such 
  responses describe the effect as reducing the variability of the estimate. (For example, “I 
  would use Method 1 since the larger sample size would give less variability of the mean.”) 

• Responses that focus on untruthful survey answers can satisfy component 2 if the effect on the 
  estimate is appropriate. (For example, “People contacted in Method 2 might say they make 
  more money than they actually do. This would make the estimated mean too high.”) 

4    Complete Response 

     All three sections essentially correct 

3    Substantial Response 

     Two sections essentially correct and one section partially correct 

2    Developing Response 

     Two sections essentially correct and one section incorrect 
     OR 
     One section essentially correct and one or two sections partially correct 
     OR 
     Three sections partially correct 

1    Minimal Response 

     One section essentially correct and two sections incorrect 
     OR 
     Two sections partially correct and one section incorrect 
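The way the three section scores combine into an overall score can be written as a short lookup. This is only a sketch of the rules listed above, not an official scoring tool: the function name `holistic_score` is hypothetical, the values 4, 3, and 2 match the scores reported in the sample commentary, and mapping Minimal Response to 1 and returning `None` for combinations the guidelines do not list are assumptions.

```python
def holistic_score(sections):
    """Combine three section scores ('E', 'P', or 'I') into an overall score.

    Follows the combinations listed in the scoring guidelines; any
    combination not listed there returns None (an assumption of this sketch).
    """
    if len(sections) != 3 or any(s not in ("E", "P", "I") for s in sections):
        raise ValueError("expected exactly three scores, each 'E', 'P', or 'I'")

    e = sections.count("E")
    p = sections.count("P")
    i = sections.count("I")

    if e == 3:
        return 4  # Complete Response
    if e == 2 and p == 1:
        return 3  # Substantial Response
    if (e == 2 and i == 1) or (e == 1 and p >= 1) or p == 3:
        return 2  # Developing Response
    if (e == 1 and i == 2) or (p == 2 and i == 1):
        return 1  # Minimal Response
    return None   # combination not listed in the guidelines


# Section scores as described in the sample commentary for 4A, 4B, and 4C.
print(holistic_score(["E", "E", "E"]))  # sample 4A
print(holistic_score(["E", "E", "P"]))  # sample 4B
print(holistic_score(["P", "P", "P"]))  # sample 4C
```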
4. As part of its twenty-fifth reunion celebration, the class of 1988 (students who graduated in 1988) at a state 
university held a reception on campus. In an informal survey, the director of alumni development asked 50 of 
the attendees about their incomes. The director computed the mean income of the 50 attendees to be $189,952. 
In a news release, the director announced, “The members of our class of 1988 enjoyed resounding success. 
Last year’s mean income of its members was $189,952!” 


(a) What would be a statistical advantage of using the median of the reported incomes, rather than the mean, 
as the estimate of the typical income? 
(b) The director felt the members who attended the reception may be different from the class as a whole. 
A more detailed survey of the class was planned to find a better estimate of the income as well as other 
facts about the alumni. The staff developed two methods based on the available funds to carry out the 
survey. 


Method 1: Send out an e-mail to all 6,826 members of the class asking them to complete an online form. 
The staff estimates that at least 600 members will respond. 


Method 2: Select a simple random sample of members of the class and contact the selected members 
directly by phone. Follow up to ensure that all responses are obtained. Because method 2 will 
require more time than method 1, the staff estimates that only 100 members of the class could be 
contacted using method 2. 


Which of the two methods would you select for estimating the average yearly income of all 6,826 members 
of the class of 1988? Explain your reasoning by comparing the two methods and the effect of each method 
on the estimate. 
Question 4 
Overview 


The primary goals of this question were to assess a student's ability to (1) describe why the median might 
be preferred to the mean in a particular context; (2) compare the relative merits of two sampling plans; and 
(3) describe a consequence of nonresponse in a particular study. 


Sample: 4A 
Score: 4 


In section 1 the response begins by stating that the distribution of income is probably right skewed, 
which satisfies component 2. This statement is followed by a correct description of how the skewness 
makes the mean greater than the median, which satisfies component 1. The response continues to provide 
other facts about how outliers affect the mean and median, which would also satisfy component 1. 
Because the response correctly describes a statistical advantage of using the median of the reported 
incomes, section 1 was scored as essentially correct. In section 2 the response correctly chooses Method 2, 
identifies a relevant characteristic of Method 1 (“method 1 is voluntary response”), and identifies a relevant 
characteristic of Method 2 (“method 2 is a simple random sample”). Because the response picks the better 
sampling method and addresses relevant characteristics of both methods, section 2 was scored as 
essentially correct. In section 3 the response describes how the incomes of responders are different from 
the incomes of nonresponders (“people with high paying jobs...are more likely...to complete the online 
form”), satisfying component 1. The response also describes the effect of the bias, with direction (“cause 
the estimate from Method 1 to be higher than the true average”), satisfying component 2. Because the 
response correctly describes the effect of nonresponse on the sample and on the estimate, section 3 was 
scored as essentially correct. Because all three sections were scored as essentially correct, the response 
earned a score of 4. 


Sample: 4B 
Score: 3 


In section 1 the response identifies possible low outliers in the distribution of incomes (“alumni who have no 
job or a small income”), satisfying component 2. In the same sentence, the response also indicates that these 
low outliers could “less affect” the median, satisfying component 1. Because the response correctly 
describes a statistical advantage of using the median of the reported incomes, section 1 was scored as 
essentially correct. In section 2 the response correctly selects Method 2 and addresses both methods with a 
comparison (“reduce bias”), satisfying both components in the first sentence. The response goes on to 
discuss both methods separately, which would also satisfy both components. Because the response picks 
the better sampling method and addresses relevant characteristics of both methods, section 2 was scored 
as essentially correct. In section 3 the response identifies how the incomes of responders might be different 
from the incomes of nonresponders (“alumni who have a low salary may be embarrassed to respond...alumni 
with higher salaries may want to respond”), satisfying component 1. However, the response does not address 
the effect of the bias on the estimate. Saying only that the answer will be “more accurate” does not provide a 
direction. Because the response satisfies only one component, section 3 was scored as partially correct. 
Because two sections were scored as essentially correct and one section was scored as partially correct, the 
response earned a score of 3. 


Sample: 4C 
Score: 2 


In section 1 the response provides a generic statement about how medians are less affected by outliers than 
means, satisfying component 1. However, because the response does not include any conjecture about the 
distribution of incomes, component 2 is not satisfied. Because the response satisfies only one component, 
section 1 was scored as partially correct. In section 2 the response correctly selects Method 2, identifies a 
relevant characteristic of Method 1 (“suffers from nonresponse bias”), and acknowledges the adequate 
sample size for Method 2. However, sample size is not considered a relevant characteristic in the way that random 
sampling or lack of bias. Because the response satisfies only one component, section 2 was scored as 
partially correct. In section 3 the response describes how the incomes of responders are different from the 
incomes of nonresponders (“certain income level alumni may be more likely to respond”), satisfying 
component 1. However, the response does not address the effect of the bias on the estimate. Because the 
response satisfies only one component, section 3 was scored as partially correct. Because three sections were 
scored as partially correct, the response earned a score of 2. 